FIELD OF THE INVENTION

The present invention relates to a method for encoding a digital video signal, said
digital video signal comprising some sets of objects with associated shapes. The invention
also relates to an encoder, said encoder implementing said method.

Such a method may be used in, for example, a video communication system for 3D 
video applications within MPEG standards. 


BACKGROUND OF THE INVENTION

A video communication system typically comprises a transmitter with an encoder 
and a receiver with a decoder. Such a system receives an input digital video signal, encodes 
said signal via the encoder, transmits the encoded signal to the receiver, then decodes the 
transmitted signal via the decoder, resulting in an output digital video signal, which is the
reconstructed signal of the input digital video signal. The receiver then displays said output
digital video signal. A 3D digital video signal comprises some images with some sets of 
objects, which are characterized in particular by some associated shapes and textures. 

Current object encoding schemes rely on the description of a specific shape.
To allow objects with several connected components and complicated shapes (intersections,
multiple edges), a block-based paradigm has been chosen by the MPEG-4 standard, in the
document referenced under the MPEG-4 document number w3056 at ISO and entitled
"Information Technology - Coding of audio-visual objects - Part 2: Visual, ISO/IEC JTC 1/SC
29/WG 11, Maui, December 1999". An object is split into several blocks. To make the
identification of said blocks easier, a system of rectangular bounding boxes is used, and the
smallest rectangular bounding box is computed. Each block within this bounding box is defined
either as "in the shape", "out of the shape" or as a "boundary block". For the latter, the
distinction between "in" and "out" is made at pixel level. One drawback of these encoding
schemes is that the use of the bounding box works well as long as objects are strictly within
the image frame, i.e. do not touch the image frame; but as soon as the objects are positioned
against the image frame or as soon as their shape has vertical or horizontal lines at its
boundaries, there are some cases in which the coding bit cost can be significantly lowered.

SUMMARY OF THE INVENTION 

Accordingly, it is an object of the invention to provide a method and an encoder for
encoding a digital video signal, said digital video signal comprising some sets of objects with
associated shapes, which lower the number of bits needed to encode objects which are
positioned against an image frame and objects the shape of which contains vertical or
horizontal lines at its boundaries.

To this end, there is provided a method comprising the steps of: 

- Defining an information for determining if the shape of an object is to be 
encoded or its complement's one, and 

- As a function of this information, encoding said shape or its complement.

In addition, there is provided an encoder comprising information for determining if
the shape of an object is to be encoded, or its complement's one, and encoding means for
encoding said shape or its complement as a function of said information.



As we will see in detail further on, by encoding the complement of the shape in
some cases instead of the original shape, the compression efficiency will be improved, as
fewer bits will be necessary to encode the shape.

BRIEF DESCRIPTION OF THE DRAWINGS 

Additional objects, features and advantages of the invention will become apparent 
upon reading the following detailed description and upon reference to the accompanying 
drawings in which: 

- Fig. 1 illustrates a video communication system comprising an encoder and a decoder 
according to the invention, 

- Fig. 2 is a schematic diagram of the method of encoding according to the invention,

- Fig. 3 represents an object and its associated shape to be encoded by the method of
encoding of Fig. 2,

- Fig. 4 represents the object of Fig. 3, which has been encoded according to a classical
method of encoding,

- Fig. 5 represents the object of Fig. 3, which has been encoded according to a first
embodiment of the method of encoding of Fig. 2, and

- Fig. 6 represents the object of Fig. 3, which has been encoded according to a second
embodiment of the method of encoding of Fig. 2.

DETAILED DESCRIPTION OF THE INVENTION 

In the following description, functions or constructions well known to the person
skilled in the art are not described in detail since they would obscure the invention in
unnecessary detail.

The present invention relates to a method for encoding a digital video signal. 



Such a method may be used within a video communication system SYS for video 
applications in MPEG2 or MPEG4, wherein said video communication system comprises a 
transmitter TRANS, a transmission medium CH and a receiver RECEIV. Said transmitter 
TRANS and said receiver RECEIV comprise an encoder ENC and a decoder DEC respectively. 

In order to transmit some video signals efficiently through the transmission medium
CH, said encoder ENC applies an encoding on a video signal, then the encoded video signal
is sent to a decoder DEC, which decodes said signal. Finally the receiver RECEIV displays
said video signal.

A video signal comprises some sets of objects usually inside some images I, wherein
an image I is represented by a plurality of pixels and said objects have associated shapes.

The encoder ENC comprises an information FLAG for determining if the shape of an 
object is to be encoded, or its complement's one, and encoding means for encoding said 
shape or its complement as a function of said information FLAG. 

The decoder DEC comprises decoding means for retrieving said information FLAG, 
for decoding said shape or its complement as a function of said information FLAG, and for 
retrieving the shape as a function of said complement if the complement has been decoded. 

The encoding of a video signal is based on a block principle. The smallest rectangle
that frames an object OBJ is computed. Such a rectangle is called a bounding box
BOUND_BOX. Said bounding box BOUND_BOX is split into blocks B that are encoded. Each
block has a type, wherein said type can be "in the shape", "out of the shape", or
"boundary block". The bounding box BOUND_BOX of an object OBJ is also called the original
bounding box.
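
By way of illustration only, the block principle described above may be sketched as follows in Python. The binary mask representation, the function names bounding_box and classify_blocks and the block parameter are assumptions of this sketch and are not taken from the MPEG-4 standard; in particular, blocks are simply clipped at the edge of the bounding box here.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]           # binary shape mask: 1 = pixel in the shape
Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax), inclusive


def bounding_box(mask: Mask) -> Box:
    """Smallest rectangle framing all pixels set to 1."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        raise ValueError("the shape contains no pixel")
    return min(xs), min(ys), max(xs), max(ys)


def classify_blocks(mask: Mask, box: Box, block: int = 16) -> Dict[Tuple[int, int], str]:
    """Map the top-left corner of each block B to its type."""
    xmin, ymin, xmax, ymax = box
    types: Dict[Tuple[int, int], str] = {}
    for by in range(ymin, ymax + 1, block):
        for bx in range(xmin, xmax + 1, block):
            # Blocks are clipped at the box edge (assumption of this sketch).
            pixels = [mask[y][x]
                      for y in range(by, min(by + block, ymax + 1))
                      for x in range(bx, min(bx + block, xmax + 1))]
            if all(pixels):
                types[(bx, by)] = "in the shape"
            elif not any(pixels):
                types[(bx, by)] = "out of the shape"
            else:
                types[(bx, by)] = "boundary block"
    return types
```

For a "boundary block", the pixel-level distinction between "in" and "out" would still have to be coded separately; the sketch only assigns the block type.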

The encoding of a digital video signal is done as follows and is illustrated by
Fig. 2 and 3.

In a first step 1), the encoder ENC performs a first process to choose which shape
of an object OBJ it will encode, the original shape or its complement (step 1a). If the
complement is to be encoded, one can choose, in a first embodiment, to use the
complement NOT_OBJ of the object OBJ in the image frame or, in a second embodiment,
the complement NOT_OBJ_BB of the object OBJ within its bounding box
BOUND_BOX (step 1b).

In a non-limitative embodiment, said first process is done by: 
- Calculating three bounding boxes BOUND_BOX, one for the original object OBJ,
one for its complement NOT_OBJ, and another one for its complement
NOT_OBJ_BB within the bounding box of the object OBJ, as shown in Fig. 4, Fig.
5 and Fig. 6 respectively,
- Choosing, for the encoding, the shape corresponding to the object OBJ, its
complement NOT_OBJ or its complement NOT_OBJ_BB within the original
bounding box, whichever has the smallest bounding box BOUND_BOX. Note that,
preferentially, NOT_OBJ_BB is chosen only if its bounding box BOUND_BOX is
considered sufficiently smaller than the bounding box BOUND_BOX of the object
OBJ and the bounding box BOUND_BOX of its complement NOT_OBJ, as will
be described hereinafter.
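
As a non-limitative sketch, the two operations above could be written as follows in Python. The mask representation, the helper names, the measurement of bounding box size as a pixel area and the margin parameter (standing for "sufficiently smaller") are assumptions of this sketch; an empty complement is simply never selected here.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # binary shape mask: 1 = pixel in the shape
Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax), inclusive


def bounding_box(mask: Mask) -> Optional[Box]:
    """Smallest rectangle framing the pixels set to 1, or None if there are none."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    return (min(xs), min(ys), max(xs), max(ys)) if xs else None


def area(box: Optional[Box]) -> float:
    """Size of a bounding box in pixels; an empty candidate is never the smallest."""
    if box is None:
        return float("inf")
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin + 1) * (ymax - ymin + 1)


def complement_in_frame(mask: Mask) -> Mask:
    """NOT_OBJ: complement of the object within the whole image frame."""
    return [[1 - v for v in row] for row in mask]


def complement_in_box(mask: Mask, box: Box) -> Mask:
    """NOT_OBJ_BB: complement of the object within its own bounding box."""
    xmin, ymin, xmax, ymax = box
    return [[(1 - v) if xmin <= x <= xmax and ymin <= y <= ymax else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(mask)]


def choose_shape(mask: Mask, margin: float = 0) -> str:
    """Return "OBJ", "NOT_OBJ" or "NOT_OBJ_BB" according to bounding box size."""
    box_obj = bounding_box(mask)
    if box_obj is None:
        raise ValueError("the object contains no pixel")
    size_obj = area(box_obj)
    size_not = area(bounding_box(complement_in_frame(mask)))
    size_bb = area(bounding_box(complement_in_box(mask, box_obj)))
    # First choice: original object unless its complement has a smaller box.
    choice, best = ("OBJ", size_obj) if size_obj <= size_not else ("NOT_OBJ", size_not)
    # NOT_OBJ_BB is retained only if it is smaller by more than the margin.
    if size_bb + margin < best:
        choice = "NOT_OBJ_BB"
    return choice
```

In practice the margin would reflect the cost of also transmitting the coordinates of the original bounding box; the value to use is left open here.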

Note that a bounding box BOUND_BOX has 4 coordinates, which correspond to the
smallest coordinates Xmin, Ymin and the greatest coordinates Xmax, Ymax in pixels taken
by the associated object OBJ within an image frame I. Note that these coordinates can also
be expressed by a position (X, Y), a length and a width for example.
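
The equivalence between the two representations may be illustrated by the following Python sketch. It assumes inclusive pixel coordinates, a position (X, Y) taken at the corner (Xmin, Ymin), a length measured along X and a width measured along Y; these conventions and the function names are assumptions of the sketch.

```python
def corners_to_rect(xmin: int, ymin: int, xmax: int, ymax: int):
    """(Xmin, Ymin, Xmax, Ymax) -> (X, Y, length, width), inclusive pixels."""
    return xmin, ymin, xmax - xmin + 1, ymax - ymin + 1


def rect_to_corners(x: int, y: int, length: int, width: int):
    """(X, Y, length, width) -> (Xmin, Ymin, Xmax, Ymax), inclusive pixels."""
    return x, y, x + length - 1, y + width - 1
```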

In the example illustrated in Fig. 3, an object OBJ is represented within an image I.
The shape of said object OBJ is the gray area. 

The complement of said object NOT_OBJ is the white area. 

The bounding box BOUND_BOX of the object OBJ is represented in Fig. 4, whereas
the bounding box BOUND_BOX of its complement NOT_OBJ is represented in Fig. 5. The
complement NOT_OBJ_BB of said object OBJ within its bounding box is the white area in
Fig. 4. Its bounding box BOUND_BOX is represented in Fig. 6. One can remark that these
bounding boxes BOUND_BOX are the rectangles in broken line that frame the object OBJ,
the complement NOT_OBJ and the complement NOT_OBJ_BB within the original bounding
box BOUND_BOX respectively.

In a first non-limitative embodiment, when the bounding box BOUND_BOX of an
object OBJ is greater than the bounding box BOUND_BOX of its complement NOT_OBJ, its
complement's shape is encoded. In a second non-limitative embodiment, if the bounding
box BOUND_BOX of the complement NOT_OBJ_BB of an object OBJ within its bounding box
BOUND_BOX is even smaller and if the difference in size of the bounding boxes (of the
complement's NOT_OBJ_BB one within the original bounding box and the object's OBJ one,
or the complement's NOT_OBJ one) is considered large enough (for example such that the
encoding of the coordinates of the original bounding box will take fewer bits than the
encoding of more blocks within a larger bounding box BOUND_BOX using the object OBJ or
its complement NOT_OBJ), the shape of this complement NOT_OBJ_BB within the original
bounding box BOUND_BOX is encoded.



As can be seen in Fig. 6, 5, and 4, the bounding box BOUND_BOX of the
complement object NOT_OBJ_BB within the original bounding box is the smallest one, then
comes the bounding box BOUND_BOX of the complement object NOT_OBJ and the original
bounding box of the object OBJ, respectively.

Indeed, one can see that in the bounding box BOUND_BOX of the original object
OBJ, there are 5 blocks called boundary blocks B_BND and 61 plain blocks, of which 16 are
blocks out of the shape B_OUT and 45 are blocks in the shape B_IN.

As for the bounding box BOUND_BOX of the complement object NOT_OBJ, there are
as many boundary blocks B_BND as for the original object OBJ, but far fewer plain blocks, 28,
of which only 1 is an out of the shape block B_OUT and 27 are in the shape blocks B_IN.

As for the bounding box BOUND_BOX of the complement object NOT_OBJ_BB within
the original bounding box, there are as many boundary blocks B_BND as for the original
object OBJ and the complement object NOT_OBJ, but even fewer plain blocks than in the
bounding box BOUND_BOX of the complement object NOT_OBJ, 17, of which only 1 is out of
the shape and 16 are in the shape.

Still, the bounding box BOUND_BOX of the complement object NOT_OBJ_BB within
the original bounding box is only 11 blocks smaller than the bounding box BOUND_BOX of
the complement object NOT_OBJ.
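
The block counts quoted above can be checked with the following short Python sketch. The dictionary layout and function names are merely illustrative; the figures themselves are those given in the description.

```python
# Block counts per candidate bounding box, as stated in the description.
BLOCKS = {
    "OBJ":        {"B_BND": 5, "B_OUT": 16, "B_IN": 45},
    "NOT_OBJ":    {"B_BND": 5, "B_OUT": 1,  "B_IN": 27},
    "NOT_OBJ_BB": {"B_BND": 5, "B_OUT": 1,  "B_IN": 16},
}


def plain_blocks(name: str) -> int:
    """Number of plain blocks (out of the shape plus in the shape)."""
    return BLOCKS[name]["B_OUT"] + BLOCKS[name]["B_IN"]


def total_blocks(name: str) -> int:
    """Total number of blocks in the bounding box, boundary blocks included."""
    return sum(BLOCKS[name].values())


# Difference in size between the boxes of NOT_OBJ and NOT_OBJ_BB, in blocks.
difference = total_blocks("NOT_OBJ") - total_blocks("NOT_OBJ_BB")
```

The totals are 66, 33 and 22 blocks for OBJ, NOT_OBJ and NOT_OBJ_BB respectively, which gives the difference of 11 blocks mentioned above.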

The encoding of these 11 blocks is likely to cost fewer bits than the encoding of the
coordinates of the original bounding box BOUND_BOX if one wants to use the complement
NOT_OBJ_BB of the object OBJ within the original bounding box.

Hence, in this example, it will be far more efficient and less expensive in terms of bit
cost to encode the shape of the complement object NOT_OBJ than that of the original object
OBJ or that of its complement NOT_OBJ_BB within the original bounding box, as fewer
bits will be used to encode said complement object NOT_OBJ shape than to encode said
complement object NOT_OBJ_BB shape within the original bounding box plus the
coordinates of the original bounding box.

In a second step 2), the encoding process begins. The encoder ENC encodes all
the characteristics of an object (whichever of the original or its complements is chosen), in
particular its associated texture, motion vectors and shape, in a manner well known to the
person skilled in the art.

During the encoding process, when it comes to the shape encoding, the information
FLAG, determining if the shape of an object has been encoded or one of its complements, is
defined at video object level (VO in MPEG-4). This information is, for example, a variable
length (one and two bit words) flag FLAG. If said flag is equal to 0, the standard coding is
used, i.e. the shape of the original object OBJ is encoded (step 2c in Fig. 2), whereas if said
flag is equal to 10, the shape of the complement NOT_OBJ is encoded (step 2b) and if
said flag is equal to 11, the shape of the complement NOT_OBJ_BB of said object OBJ within
its bounding box BOUND_BOX is encoded along with the coordinates of the bounding box of
said object OBJ (step 2a).
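
A minimal Python sketch of such a variable length flag is given below for illustration. The code words 0, 10 and 11 are those of the description; the representation of the bit stream as a character string and the names write_flag and read_flag are assumptions of this sketch, not MPEG-4 syntax.

```python
# Code words of the flag FLAG, as given in the description.
FLAG_CODES = {"OBJ": "0", "NOT_OBJ": "10", "NOT_OBJ_BB": "11"}


def write_flag(choice: str) -> str:
    """Return the one or two bit code word for the chosen shape."""
    return FLAG_CODES[choice]


def read_flag(bits: str, pos: int = 0):
    """Parse the flag starting at position pos; return (choice, next position)."""
    if bits[pos] == "0":
        return "OBJ", pos + 1
    # A leading 1 announces a two bit word: 10 or 11.
    choice = "NOT_OBJ" if bits[pos + 1] == "0" else "NOT_OBJ_BB"
    return choice, pos + 2
```

Because no code word is the beginning of another, the flag can be parsed without any length field, and the most common case (the original shape) costs a single bit.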

In our example, the information FLAG is set to 10 as illustrated in the step 2b) of
Fig. 2.

In a third step 3), the encoder ENC encodes the shape of the chosen object,
either the original one OBJ (step 3c), its complement NOT_OBJ (step 3b) or the shape of its
complement NOT_OBJ_BB within the original bounding box BOUND_BOX with the
coordinates of the bounding box BOUND_BOX of said object OBJ (step 3a).

In our example, it encodes the shape of the complement object NOT_OBJ as
illustrated in the step 3b) of Fig. 2.

Finally, the transmitter TRANS transmits in particular the encoded shape to the
receiver RECEIV, and thus to the decoder DEC.

During the decoding process, at the decoder DEC side, the knowledge of the value 
of the information FLAG will tell said decoder DEC what to do. 

If set to 0, this flag FLAG indicates that the original shape was encoded, and as a
consequence the decoded shape is the standard one. If set to 10, this flag FLAG
indicates that the complement of the original shape in the image frame was encoded, and
that one should compute the complement of the decoded shape in order to retrieve the
original shape. If set to 11, this flag FLAG indicates that the complement NOT_OBJ_BB
of the original shape within its bounding box was encoded along with the coordinates of said
original bounding box and that one should compute the complement of the decoded shape
within the bounding box defined by the decoded coordinates.
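
The behaviour of the decoder DEC for each value of the flag may be sketched as follows in Python. The binary mask representation, the function name recover_shape and the passing of the flag as a character string are assumptions of this sketch.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # binary shape mask: 1 = pixel in the shape
Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax), inclusive


def recover_shape(decoded: Mask, flag: str, box: Optional[Box] = None) -> Mask:
    """Retrieve the original shape from the decoded shape and the flag FLAG."""
    if flag == "0":
        # The original shape was encoded: nothing to do.
        return [row[:] for row in decoded]
    if flag == "10":
        # Complement within the image frame: invert every pixel.
        return [[1 - v for v in row] for row in decoded]
    if flag == "11":
        # Complement within the original bounding box: invert inside the
        # decoded box only; the object has no pixel outside its own box.
        if box is None:
            raise ValueError("flag 11 requires the decoded box coordinates")
        xmin, ymin, xmax, ymax = box
        return [[(1 - v) if xmin <= x <= xmax and ymin <= y <= ymax else 0
                 for x, v in enumerate(row)]
                for y, row in enumerate(decoded)]
    raise ValueError("unknown flag value: " + flag)
```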

Note that the method for encoding according to the invention is preferentially
applied to an original object OBJ that is positioned against an image frame or the shape of
which contains horizontal or vertical lines at its boundaries, i.e. when all or part of said lines
meet the bounding box. This is especially the case when dealing with large objects. In
case an original object OBJ with no specific boundaries is strictly inside an image frame, i.e.
doesn't touch the edges of the frame, the classical encoding as described in the MPEG-4
standard is sufficient.

Therefore, preferentially, the information FLAG is activated, i.e. used, when an
object OBJ has a bounding box BOUND_BOX with frontiers in common with the image I
comprising said object OBJ or the shape of which contains horizontal or vertical lines at its
boundaries.
Thus, one advantage of the present invention is to tell the decoder, and therefore
the receiver, how to decode the shape of an object. 

Moreover, the use of a flag allows the type of the shape of an object, original or
complement, to be defined simply, and the shapes of the objects within an image to be
encoded in a more efficient way.

It is to be understood that the present invention is not limited to the aforementioned
embodiments and variations and modifications may be made without departing from the
spirit and scope of the invention as defined in the appended claims. In this respect, the
following closing remarks are made.

It is to be understood that the present invention is not limited to the aforementioned
video application. It can be used within any application using a system for processing a
signal taking into account shapes of objects. In particular, the invention applies to video
compression algorithms of the other standards of the MPEG family (MPEG-1, MPEG-2) and to
the ITU H26X family (H261, H263 and extensions, H26L being the latest today, reference
number Q15-K-59).

It is to be understood that the method according to the present invention is not
limited to the aforementioned implementation.

There are numerous ways of implementing functions of the method according to the
invention by means of items of hardware or software, or both, provided that a single item of
hardware or software can carry out several functions. It does not exclude that an assembly
of items of hardware or software or both carry out a function, thus forming a single function
without modifying the method for processing the video signal in accordance with the
invention.

Said hardware or software items can be implemented in several manners, such as by
means of wired electronic circuits or by means of an integrated circuit that is suitably
programmed, respectively. The integrated circuit can be contained in a computer or in an
encoder. In the second case, the encoder comprises an information for determining if the
shape of an object is to be encoded, or its complement's one, and encoding means for
encoding said shape or its complement as a function of said information, as described
previously, said information or means being hardware or software items as above stated.
The integrated circuit comprises a set of instructions. Thus, said set of instructions
contained, for example, in a computer programming memory or in an encoder memory may
cause the computer or the encoder to carry out the different steps of the encoding method.
The set of instructions may be loaded into the programming memory by reading a
data carrier such as, for example, a disk. A service provider can also make the set of 
instructions available via a communication network such as, for example, the Internet. 

Any reference sign in the following claims should not be construed as limiting the 
claim. It will be obvious that the use of the verb "to comprise" and its conjugations does not
exclude the presence of any other steps or elements besides those defined in any claim. The 
article "a" or "an" preceding an element or step does not exclude the presence of a plurality 
of such elements or steps. 
CLAIMS

1. A method for encoding a digital video signal, said digital video signal comprising some
sets of objects (OBJ) with associated shapes, characterized in that it comprises the steps
of: 

- Defining an information (FLAG) for determining if the shape of an object (OBJ) is to 
be encoded, or its complement's one, and 

- As a function of this information (FLAG), encoding said shape or its complement.

2. A method of processing a digital video signal as claimed in claim 1, characterized in that 
the complement is the complement (NOT_OBJ) of an object (OBJ) in an image frame.

3. A method of processing a digital video signal as claimed in claim 1, characterized in that 
a bounding box (BOUND_BOX) is associated to an object (OBJ) and the complement is 
the complement (NOT_OBJ_BB) of an object (OBJ) within its bounding box 
(BOUND_BOX). 

4. A method of processing a digital video signal as claimed in claim 3, characterized in that 
it has a further step of encoding the bounding box coordinates of said object (OBJ). 

5. A method of processing a digital video signal as claimed in any preceding claims 1 to 4,
characterized in that the information is activated when an object (OBJ) has a bounding
box (BOUND_BOX) with frontiers in common with an image comprising said object
(OBJ).

6. A method of processing a digital video signal as claimed in any preceding claims 1 to 5,
characterized in that when the bounding box (BOUND_BOX) of an object (OBJ) is
greater than the bounding box (BOUND_BOX) of its complement (NOT_OBJ,
NOT_OBJ_BB), its complement's shape is encoded.

7. A computer program product for an encoder (ENC), comprising a set of instructions,
which, when loaded into said encoder (ENC), causes the encoder (ENC) to carry out the
method claimed in claims 1 to 6.

8. A computer program product for a computer, comprising a set of instructions, which,
when loaded into said computer, causes the computer to carry out the method claimed
in claims 1 to 6.

9. A method for decoding a digital video signal, said digital video signal comprising some
sets of objects (OBJ) with associated shapes, characterized in that it comprises the steps
of:

- Retrieving an information (FLAG), which determines if the shape of an object (OBJ)
has been encoded or its complement's one,

- As a function of said information (FLAG), decoding said shape or its complement
(NOT_OBJ, NOT_OBJ_BB), and

- If the complement has been decoded, retrieving the shape as a function of said
complement (NOT_OBJ, NOT_OBJ_BB).

10. An encoder (ENC) for encoding a digital video signal, said digital video signal comprising 
some sets of objects (OBJ) with associated shapes, characterized in that it comprises an 
information (FLAG) for determining if the shape of an object (OBJ) is to be encoded, or
its complement's one, and encoding means for encoding said shape or its complement
as a function of said information (FLAG). 

11. A decoder (DEC) for decoding a digital video signal, said digital video signal comprising 
some sets of objects (OBJ) with associated shapes, characterized in that it comprises 
decoding means for retrieving an information (FLAG), which determines if the shape of
an object (OBJ) has been encoded or its complement's one, for decoding said shape or
its complement as a function of said information (FLAG), and for retrieving the shape as
a function of said complement if the complement (NOT_OBJ, NOT_OBJ_BB) has been
decoded.

12. A video communication system (SYS), which is able to receive a digital video signal, 
comprising a transmitter (TRANS) with an encoder (ENC) as claimed in claim 10 for
encoding said video signal, a transmission channel (CH) for transmitting the encoded 
video signal and a receiver (RECEIV) with a decoder (DEC) as claimed in claim 11 for 
decoding said encoded video signal. 
Method and encoder for encoding a digital video signal
ABSTRACT 

The present invention relates to a method and an encoder for encoding a digital video 
signal, said digital video signal comprising some sets of objects (OBJ) with associated 
shapes. The invention is characterized in that it comprises the steps of: 

- Defining an information (FLAG) for determining if the shape of an object (OBJ) is to
be encoded, or its complement's one, and 

- As a function of this information (FLAG), encoding said shape or its complement. 

Use: encoder in a video communication system 

Reference: Fig. 2 